Building Text-to-Speech Systems for Resource Poor Languages
Authors
Abstract
This research develops a method for building text-to-speech (TTS) systems for resource-poor languages by using data from other languages to fine-tune a general polyglot TTS template architecture. The method has three main components: language clustering, phoneme mapping and prosody modelling. As a proof of concept, four TTS systems have been implemented, for English, Spanish, Malay and Iban.
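The abstract does not say how the phoneme mappings are computed. As a rough illustration only, one common way to approach such a mapping is to pair each phoneme of the resource-poor language with the nearest phoneme in a better-resourced language, measured over articulatory features. The sketch below assumes that approach; the feature table, the function names `feature_distance` and `map_phonemes`, and the toy inventories are all hypothetical and are not taken from the paper.

```python
# Hypothetical articulatory features per phoneme: (voiced, place, manner).
# The values are illustrative placeholders, not a real phonological table.
FEATURES = {
    "p": (0, 0, 0),
    "b": (1, 0, 0),
    "t": (0, 1, 0),
    "d": (1, 1, 0),
    "k": (0, 2, 0),
    "g": (1, 2, 0),
    "s": (0, 1, 1),
    "z": (1, 1, 1),
}


def feature_distance(a, b):
    """Count the feature positions in which two phonemes differ."""
    return sum(x != y for x, y in zip(FEATURES[a], FEATURES[b]))


def map_phonemes(target_inventory, source_inventory):
    """Map each target-language phoneme to its closest source-language phoneme.

    Ties are broken alphabetically so the result is deterministic.
    """
    mapping = {}
    for t in target_inventory:
        mapping[t] = min(
            source_inventory,
            key=lambda s: (feature_distance(t, s), s),
        )
    return mapping


# Toy example: the target language has voiced sounds that the source lacks.
mapping = map_phonemes(["b", "z", "g"], ["p", "t", "k", "s"])
print(mapping)
```

In this toy setup each voiced phoneme falls back to its voiceless counterpart, because that is the candidate differing in the fewest features. A real system would need a full feature inventory and, most likely, weighting of the features.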
Similar resources
Technique for automatic sentence level alignment of long speech and transcripts
A frugal approach to constructing speech corpora, especially for resource-deficient languages, is to exploit collections of speech and corresponding text data available in audio books, news and lectures. However, using these resources for building speech corpora requires an alignment of the long-duration speech data with the accompanying text data. Existing techniques for automatic speech-text alignmen...
TagMiner: A Semisupervised Associative POS Tagger Effective for Resource Poor Languages
We present TagMiner, a data mining approach for part-of-speech (POS) tagging, an important natural language processing (NLP) classification task. It is a semi-supervised associative classification method for POS tagging. Existing methods for building POS taggers require extensive domain and linguistic knowledge and resources. Our method uses a combination of a small POS tagged corpus and a ...
Open-Source Consumer-Grade Indic Text To Speech
Open-source text-to-speech (TTS) software has enabled the development of voices in multiple languages, including many high-resource languages, such as English and European languages. However, building voices for low-resource languages is still challenging. We describe the development of TTS systems for 12 Indian languages using the Festvox framework, for which we developed a common frontend for...
Pronunciation Modeling F
Natural and intelligible text-to-speech (TTS) systems exist for a number of languages in the world today. However, for many of the world's languages, building TTS systems is still prohibitive due to the lack of linguistic resources and data. Some of these languages are spoken by a large population of the world. Others are primarily spoken languages, or languages with large non-li...
Experiments with Unit Selection Speech Databases for Indian Languages
This paper presents a brief overview of unit selection speech synthesis and discusses the issues relevant to the development of voices for Indian languages. We discuss a few perceptual experiments conducted on Hindi and Telugu voices. 1 Role of Language Technologies Most of the information in the digital world is accessible to a few who can read or understand a particular language. Language technolog...